Fitting GP to lightcurve 428_…in Stan
Notebook outlining the fitting of GP to thunderKAT lightcurve ID$ 428_…
1 Light Curve
- The light curve has \(N =\) 21 observations over a range of 147.57 days.
- Observations are evenly spread over the time range.
- The shortest gap between observations is 5 days.
- The longest gap between observations is 14.68 days.
- The mean flux density is \(\bar{y} =\) 0.19 Jy.
- The mean standard error in the observations is 0.000491 Jy.
- The observational noise is very small relative to the brightness of the observations.
2 SE Basic Model
- Zero constant mean function.
- Squared Exponential kernel function.
- Homoskedastic noise.
- Wide prior on observational noise, uninformed by observational noise estimates.
\[y \sim \mathcal{N}(f(x), \sigma_\textrm{noise}^2)\]
\[f \sim \mathcal{GP}(\boldsymbol{0}, k(x, x'))\]
\[k(x,x') = \eta^2 \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell^2}\right\}\]
\[\ell \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_\textrm{noise} \sim \mathcal{N}^+(0,1)\]
2.1 MCMC Results
variable mean median sd mad q5 q95
eta 0.184588 0.152648 0.113476 0.065523 0.083235 0.394774
ell 150.580662 132.664000 78.972809 60.348195 63.226605 295.029000
sigma 0.006285 0.006116 0.001137 0.001059 0.004713 0.008371
rhat ess_bulk ess_tail
0.999966 2961 2328
1.001321 2922 2491
1.000792 3212 2674
2.2 MCMC Plots
2.3 Posterior Predictive Samples
The fitted model has a very long lengthscale, comparable to the length of the observational window. The estimated observational noise has a standard deviation more than an order of magnitude of that in the original data. The combination of these parameters has lead to a very smooth fit that passes through the middle of the observed data points rather than through any datapoints themselves.
2.4 PSD
3 SE Observational Errors Model
- Zero constant mean function.
- Squared exponential kernel function.
- Heteroskedastic (Gaussian) noise.
- Incorporate data on error in observations of each \(y_i\).
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(\boldsymbol{0}, k(x, x'))\]
\[k(x,x') = \eta^2 \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell^2}\right\}\]
\[\ell \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
3.1 MCMC Results
variable mean median sd mad q5 q95 rhat
eta 0.139203 0.135184 0.027459 0.025415 0.102757 0.189866 0.999570
ell 11.188619 11.204300 0.493513 0.472801 10.334480 11.964810 1.001436
sigma[1] 0.000393 0.000393 0.000042 0.000042 0.000326 0.000461 1.004134
ess_bulk ess_tail
6179 3134
5518 3015
8251 2818
3.2 MCMC Plots
3.3 Posterior Predictive Samples
By including the observed observational errors for setting priors on the Gaussian noise of each observation, the fitted median passes through each of the observed points.
3.4 PSD
4 SE Non-zero flat mean function Model
- Constant mean function, learned from observations.
- Weak prior on mean function intercept.
- Squared exponential kernel function.
- Heteroskedastic (Gaussian) noise.
- Incorporate data on error in observations of each \(y_i\).
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(C, k(x, x'))\]
\[C \sim \mathcal{U}[-1,1]\]
\[k(x,x') = \eta^2 \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell^2}\right\}\]
\[\ell \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
4.1 MCMC Results
variable mean median sd mad q5 q95 rhat
eta 0.007895 0.007729 0.001376 0.001309 0.005962 0.010359 1.000380
ell 1.551013 1.128900 1.205045 0.561561 0.545507 4.463428 1.001012
C 0.190395 0.190399 0.001818 0.001800 0.187497 0.193298 0.999727
sigma[1] 0.000393 0.000393 0.000042 0.000042 0.000323 0.000462 1.004526
ess_bulk ess_tail
4796 2712
2350 1496
5823 2642
5758 2553
4.2 MCMC Plots
4.3 Posterior Predictive Samples
4.4 PSD
5 SE Fixed constant mean function Model
- Constant mean function set at fixed value, e.g., mean of observations.
- Squared exponential kernel function.
- Heteroskedastic (Gaussian) noise.
- Incorporate data on error in observations of each \(y_i\).
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(C, k(x, x')),\quad C \in \mathbb{R}\]
\[k(x,x') = \eta^2 \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell^2}\right\}\]
\[\ell \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
5.1 Mean Function = 0.2
variable mean median sd mad q5 q95 rhat
eta 0.012121 0.011760 0.002476 0.002219 0.008833 0.017134 1.015402
ell 5.533317 6.360585 2.324641 1.453578 0.807717 8.088779 1.011678
sigma[1] 0.000394 0.000394 0.000039 0.000037 0.000325 0.000456 1.016685
ess_bulk ess_tail
151 195
175 114
134 295
5.2 Mean Function = 0.195
variable mean median sd mad q5 q95 rhat
eta 0.009070 0.008887 0.001540 0.001421 0.006912 0.011873 1.002037
ell 2.037204 1.246580 1.786842 0.719271 0.579282 6.118782 1.007532
sigma[1] 0.000394 0.000394 0.000041 0.000040 0.000326 0.000461 1.000229
ess_bulk ess_tail
3000 1893
858 1132
4124 2236
5.3 Mean Function = 0.18
variable mean median sd mad q5 q95 rhat
eta 0.012620 0.012264 0.002507 0.002241 0.009247 0.017257 1.002786
ell 5.164802 6.002850 2.369866 1.771892 0.800258 7.872616 1.008966
sigma[1] 0.000395 0.000395 0.000041 0.000041 0.000327 0.000461 1.003632
ess_bulk ess_tail
1908 1545
481 1453
3343 2409
6 Matern 3/2 kernel
- Matern 3/2 covariance kernel
- Zero constant mean function
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(\boldsymbol{0}, k(x, x'))\]
\[k(x,x') = \eta^2 \left( 1 + \frac{\sqrt{3(x - x')^2}}{\ell}\right) \exp\left\{ -\frac{\sqrt{3(x - x')^2}}{\ell}\right\}\]
\[\ell \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
6.1 MCMC Results
variable mean median sd mad q5 q95 rhat
eta 0.154295 0.138354 0.064650 0.043151 0.090575 0.273144 1.000380
ell 77.002830 72.413850 22.854942 18.158514 49.429170 120.076800 1.000103
sigma[1] 0.000393 0.000393 0.000041 0.000041 0.000327 0.000460 1.000717
ess_bulk ess_tail
4108 2402
3854 2180
7848 2658
6.2 MCMC Plots
6.3 Posterior Predictive Samples
6.4 PSD
7 SE + Matern 3/2 additive kernel
- Sum of squared exponential and Matern 3/2 kernels
- single output scale (marginal variance) hyperparameter
- zero constant mean function
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(\boldsymbol{0}, k(x, x'))\]
\[k(x,x') = \eta^2 \left[ \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell_\mathrm{SE}^2}\right\} + \left( 1 + \frac{\sqrt{3(x - x')^2}}{\ell_\mathrm{M}}\right) \exp\left\{ -\frac{\sqrt{3(x - x')^2}}{\ell_\mathrm{M}}\right\} \right]\]
\[\ell_\mathrm{SE} \sim \mathrm{InvGamma}(5,5)\]
\[\ell_\mathrm{M} \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
7.1 MCMC Results
variable mean median sd mad q5 q95 rhat
eta 0.010996 0.008687 0.008222 0.004604 0.003972 0.025317 1.000166
ell_SE 40.137106 34.952250 21.529021 13.219603 19.576910 75.923105 1.000316
ell_M 54.647213 52.715750 14.041654 12.820265 35.660370 80.404210 1.000264
sigma[1] 0.000394 0.000394 0.000041 0.000039 0.000327 0.000461 0.999895
ess_bulk ess_tail
3901 2446
4500 2262
3855 2547
6845 2694
7.2 MCMC Plots
7.3 Posterior Predictive Samples
7.4 PSD
8 SE x Matern 3/2 multiplicative kernel
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(\boldsymbol{0}, k(x, x'))\]
\[k(x,x') = \eta^2 \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell_\mathrm{SE}^2}\right\}\left( 1 + \frac{\sqrt{3(x - x')^2}}{\ell_\mathrm{M}}\right) \exp\left\{ -\frac{\sqrt{3(x - x')^2}}{\ell_\mathrm{M}}\right\}\]
\[\ell_\mathrm{SE} \sim \mathrm{InvGamma}(5,5)\]
\[\ell_\mathrm{M} \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
8.1 MCMC Results
variable mean median sd mad q5 q95 rhat
eta 0.034323 0.029664 0.024104 0.018622 0.010214 0.072489 1.161106
ell_SE 34.832319 35.641000 32.745772 48.002919 0.663607 91.153940 1.649258
ell_M 48.562943 59.813300 40.049847 46.078022 0.677478 107.348000 1.658186
sigma[1] 0.000393 0.000393 0.000041 0.000041 0.000327 0.000461 0.999732
f_star[1] 0.183917 0.183925 0.000397 0.000393 0.183261 0.184568 1.000160
ess_bulk ess_tail
16 754
6 53
6 53
10380 5488
7894 8134
8.2 MCMC Plots
8.3 Posterior Predictive Samples
8.4 PSD
9 Matern 3/2 + QP kernel
\[y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)\]
\[f \sim \mathcal{GP}(\boldsymbol{0}, k(x, x'))\]
\[k(x,x') = \eta^2 \left[ \exp\left\{ -\frac{2 \sin^2\left( \pi\frac{\sqrt{(x - x')^2}}{T}\right)}{\ell_\mathrm{P}^2}\right\} + \exp\left\{ -\frac{1}{2}\frac{(x - x')^2}{\ell_\mathrm{SE}^2}\right\} + \left( 1 + \frac{\sqrt{3(x - x')^2}}{\ell_\mathrm{M}}\right) \exp\left\{ -\frac{\sqrt{3(x - x')^2}}{\ell_\mathrm{M}}\right\} \right]\]
\[\ell_\mathrm{P} \sim \mathrm{InvGamma}(5,5)\]
\[\ell_\mathrm{SE} \sim \mathrm{InvGamma}(5,5)\]
\[\ell_\mathrm{M} \sim \mathrm{InvGamma}(5,5)\]
\[\eta \sim \mathcal{N}^+(0,1)\]
\[T \sim \mathcal{U}[\textrm{minimum gap in x}, \textrm{range of x}]\]
\[\sigma_i \sim \mathcal{N}^+(\textrm{stderr}(y_i), \mathrm{Var}(\textrm{stderr}(\boldsymbol{y})))\]
9.1 MCMC Results
variable mean median sd mad q5 q95 rhat ess_bulk
eta 0.0042 0.0034 0.0029 0.0017 0.0017 0.0090 1.0005 5446
ell_SE 26.9376 23.8059 14.0313 8.5754 13.2651 50.5492 1.0047 1101
ell_M 37.3853 36.4702 10.6089 9.2197 23.1298 55.3967 1.0047 1010
ell_P 2.4831 2.1587 1.2520 0.8721 1.1937 4.8006 1.0009 7707
T 110.2745 115.1965 27.0775 28.7397 59.7368 144.7272 1.0013 4654
ess_tail
4873
423
409
4854
2420